knitr document van Steensel lab
TF reporter cDNA-count processing - stimulation 1
Introduction
I previously processed the raw sequencing data, optimized the barcode clustering, quantified the pDNA data, and normalized the cDNA data. In this script, I take a detailed look at the cDNA data from a general perspective.
Analysis
First insights into data distribution - reporter activity distribution plots
Heat map - display mean log2-activity for each TF in each condition
~10 of the 26 TFs in the library show promising activity at first glance.
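The code chunk behind this heat map is not shown in the rendered report. As a minimal sketch of the summarising step, the mean log2-activity per TF and condition could be computed as below. The data frame `cDNA_df_toy`, its column names (`tf`, `condition`, `activity`) and all values are hypothetical placeholders, not the real data.

```r
# Toy data: all names and values here are assumptions for illustration only.
cDNA_df_toy <- data.frame(
  tf        = rep(c("TF_A", "TF_B", "TF_C"), each = 4),
  condition = rep(c("ctrl", "stim"), times = 6),
  activity  = c(1, 2, 1, 2,   # TF_A: 2-fold higher in "stim"
                1, 8, 1, 8,   # TF_B: 8-fold higher in "stim"
                4, 4, 4, 4)   # TF_C: unchanged
)

# Work on the log2 scale, then average per TF x condition.
cDNA_df_toy$log2_activity <- log2(cDNA_df_toy$activity)
mean_mat <- tapply(
  cDNA_df_toy$log2_activity,
  list(tf = cDNA_df_toy$tf, condition = cDNA_df_toy$condition),
  mean
)
print(mean_mat)
```

The resulting matrix has one row per TF and one column per condition, which is the shape that `pheatmap::pheatmap()` (listed among the attached packages) accepts as input.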
Heatmap for native enhancers
# motfn=/home/f.comoglio/mydata/Annotations/TFDB/Curated_Natoli/update_2017/20170320_pwms_selected.meme
# odir=/home/m.trauernicht/mydata/projects/tf_activity_reporter/data/SuRE_TF_1/results/native-enhancer/fimo
# query=/home/m.trauernicht/mydata/projects/tf_activity_reporter/data/SuRE_TF_1/results/native-enhancer/cDNA_df_native.fasta
# nice -n 19 fimo --no-qvalue --thresh 1e-4 --verbosity 1 --o $odir $motfn $query
Heatmap per TF - comparing design activities mutated vs. non-mutated
Heatmap per TF - only WT TF activities
Compute activity changes relative to their negative controls
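One way this step could look is sketched below. It assumes that each TF has at least one negative-control reporter per condition and that the change is defined as the difference in log2-activity to the mean of those controls. The report does not show the actual definition, so the data frame `reporters_toy`, the `neg_ctrl` flag and all values are hypothetical.

```r
# Toy data: names, flags and values are assumptions for illustration only.
reporters_toy <- data.frame(
  tf            = c("TF_A", "TF_A", "TF_A", "TF_B", "TF_B", "TF_B"),
  condition     = "stim",
  neg_ctrl      = c(TRUE, FALSE, FALSE, TRUE, FALSE, FALSE),
  log2_activity = c(0, 2, 3, 1, 1, 1.5)
)

# Mean log2-activity of the negative controls per TF and condition.
ctrl <- aggregate(
  log2_activity ~ tf + condition,
  data = reporters_toy[reporters_toy$neg_ctrl, ],
  FUN  = mean
)
names(ctrl)[names(ctrl) == "log2_activity"] <- "ctrl_log2_activity"

# Join the control level back onto the non-control reporters and subtract.
rel <- merge(reporters_toy[!reporters_toy$neg_ctrl, ], ctrl,
             by = c("tf", "condition"))
rel$log2_change <- rel$log2_activity - rel$ctrl_log2_activity
print(rel[, c("tf", "condition", "log2_activity", "log2_change")])
```

A `log2_change` of 0 means the reporter is no more active than its negative control. In this toy example the TF_A reporters sit 2 and 3 log2 units above their control.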
All of these heatmaps indicate that we have informative reporters for ~10 TFs, and that the TF reporter design matters for some but not all TFs.
Log-linear expression modelling to explain variance - model for each TF
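The modelling code is not included in the rendered output. A minimal sketch of the idea is to fit one linear model per TF on log2-activity and read the share of variance attributed to each term from the ANOVA table. The predictors used here (`spacing`, `condition`) and the simulated data are assumptions chosen for illustration; they are not the design features or results of the actual library.

```r
set.seed(1)

# Simulated toy design: all factor names and levels are hypothetical.
design_toy <- expand.grid(
  tf        = c("TF_A", "TF_B"),
  spacing   = c("5bp", "10bp"),
  condition = c("ctrl", "stim"),
  replicate = 1:3,
  stringsAsFactors = FALSE
)

# Simulated effects: TF_A responds to the condition, TF_B to the spacing.
design_toy$log2_activity <-
  2 * (design_toy$tf == "TF_A" & design_toy$condition == "stim") +
  1 * (design_toy$tf == "TF_B" & design_toy$spacing == "10bp") +
  rnorm(nrow(design_toy), sd = 0.1)

# One linear model on the log2 scale per TF.
fits <- lapply(split(design_toy, design_toy$tf), function(d) {
  lm(log2_activity ~ spacing + condition, data = d)
})

# Fraction of the total sum of squares attributed to each term.
var_explained <- sapply(fits, function(fit) {
  tab <- anova(fit)
  setNames(tab[["Sum Sq"]] / sum(tab[["Sum Sq"]]), rownames(tab))
})
print(round(var_explained, 2))
```

Because `anova()` reports sequential sums of squares, the split between terms depends on their order in the formula whenever the design is unbalanced. In this balanced toy design the order does not matter.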
Log-linear expression modelling to explain variance - model for each TF - only WT - without condition
Make the same models as before - but now per TF and per condition
Session Info
## [1] "Run time: 1.875579 mins"
## [1] "/DATA/usr/m.trauernicht/projects/SuRE-TF/gen-1_stimulation-1"
## [1] "Wed Nov 25 16:39:44 2020"
## R version 3.6.3 (2020-02-29)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.7 LTS
##
## Matrix products: default
## BLAS: /usr/lib/libblas/libblas.so.3.6.0
## LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] gridExtra_2.3 cowplot_1.0.0 plyr_1.8.6 viridis_0.5.1
## [5] viridisLite_0.3.0 ggforce_0.3.1 ggbeeswarm_0.6.0 ggpubr_0.2.5
## [9] magrittr_1.5 pheatmap_1.0.12 tibble_3.0.1 maditr_0.6.3
## [13] dplyr_0.8.5 ggplot2_3.3.0 RColorBrewer_1.1-2
##
## loaded via a namespace (and not attached):
## [1] prettydoc_0.4.0 beeswarm_0.2.3 tidyselect_1.1.0 xfun_0.19
## [5] purrr_0.3.3 splines_3.6.3 lattice_0.20-38 colorspace_1.4-1
## [9] vctrs_0.2.4 htmltools_0.5.0 yaml_2.2.1 mgcv_1.8-31
## [13] rlang_0.4.8 pillar_1.4.3 glue_1.4.2 withr_2.1.2
## [17] tweenr_1.0.1 lifecycle_0.2.0 stringr_1.4.0 munsell_0.5.0
## [21] ggsignif_0.6.0 gtable_0.3.0 evaluate_0.14 labeling_0.3
## [25] knitr_1.30 vipor_0.4.5 Rcpp_1.0.5 scales_1.1.0
## [29] farver_2.0.1 digest_0.6.27 stringi_1.5.3 polyclip_1.10-0
## [33] grid_3.6.3 tools_3.6.3 crayon_1.3.4 pkgconfig_2.0.3
## [37] Matrix_1.2-18 ellipsis_0.3.0 MASS_7.3-51.5 data.table_1.12.8
## [41] assertthat_0.2.1 rmarkdown_2.5 R6_2.5.0 nlme_3.1-143
## [45] compiler_3.6.3